Clean the environment.
Set locations, and the working directory …
Defining phenotypes and datasets.
Create a new analysis directory, including subdirectories.
[1] FALSE
[1] FALSE
[1] FALSE
[1] FALSE
Setting working directory and listing its contents.
[1] "/Users/slaan3/PLINK/analyses/lookups/AE_20190912_010_MDICHGANS_SWVDLAAN_IL6_MCP1/scRNAseq"
[1] "20200612.AESCRNA.scrnaseq_results.RData" "AESCRNA" "scRNAseq.nb.html" "scRNAseq.Rmd"
[5] "scRNAseq.Rproj"
… a package-installation function …
… and load those packages.
We will create a datestamp and define the Utrecht Science Park Colour Scheme.
For the ERA-CVD ‘druggable-MI-targets’ project (grantnumber: 01KL1802) we will perform two related RNA sequencing (RNAseq) experiments:
conventional (‘bulk’) RNAseq using RNA extracted from carotid plaque samples, n ± 700. As of Friday, June 12, 2020 all samples have been selected and RNA has been extracted; quality control (QC) was performed and we have a dataset of 635 samples.
single-cell RNAseq (scRNAseq) of at least n = 40 samples (20 females, 20 males). As of Friday, June 12, 2020 data is available of 40 samples (3 females, 15 males), we are extending sampling to get more female samples.
Plaque samples are derived from carotid endarterectomies as part of the Athero-Express Biobank Study which is an ongoing study in the UMC Utrecht.
Using a Mendelian Randomization approach, we recently examined associations between the circulating levels of 41 cytokines and growth factors and the risk of stroke in the MEGASTROKE GWAS dataset (67,000 stroke cases and 450,000 controls) and found TARGET_A as the cytokine showing the strongest association with stroke, particularly large artery and cardioembolic stroke1. Genetically elevated MCP-1 levels were also associated with a higher risk of coronary artery disease (CAD) and myocardial infarction (MI)2. Further, in a meta-analysis of observational population-based of longitudinal cohort studies we recently showed that baseline levels of TARGET_A were associated with a higher risk of ischemic stroke over follow-up3. TARGET_A
While these data suggest a central role of TARGET_A in the pathogenesis of atherosclerosis, it remains unknown if TARGET_A levels in the blood really reflect TARGET_A activity. TARGET_A is expressed in the atherosclerotic plaque and attracts monocytes in the subendothelial space4567. Thus, TARGET_A levels in the plaque might more strongly reflect TARGET_A signaling. However, it remains unknown which cells in the plaques are interacting with the circulating monocytes.
In this project we aim to map these genes to individual cells from carotid endarterectomy patients.
First we will load the data:
Here we load the latest dataset from our Athero-Express Single Cell RNA experiment.
scRNAseqData <- readRDS(paste0(RAWDATA, "/Seuset_40_patients/Seuset_40_patients.RDS"))
scRNAseqData
An object of class Seurat
38835 features across 6191 samples within 2 assays
Active assay: SCT (18283 features, 3000 variable features)
1 other assay present: RNA
2 dimensional reductions calculated: pca, umap
N_GENES=18283
The naming/classification is based on a combination conventional markers. We do not claim to know the exact identity of each cell, rather we refer to cells as ‘KIT+ Mast cells"-like cells. Likewise we refer to the cell clusters as ’communities’ of cells that exihibit similar properties, i.e. similar defining markers (e.g. KIT).
We will rename the cell types to human readable names.
### change names for clarity
backup.scRNAseqData = scRNAseqData
# get the old names to change to new names
UMAPPlot(scRNAseqData, label = FALSE, pt.size = 1.25, label.size = 4, group.by = "ident")
unique(scRNAseqData@active.ident)
[1] CD14+CD68+ Macrophages II CD3+CD4+ T Cells II CD14+CD68+ Macrophages I CD3+CD8+ T cells I CD34+ Endothelial Cells II CD3+CD8A+ T Cells II
[7] Mixed Cells II Mixed Cells I CD14+CD68+ Macrophages III NCAM1+ Natural Killer Cells ACTA2+ Smooth Muscle Cells CD34+ Endothelial Cells I
[13] CD3+CD4+ T Cells III KIT+ Mast Cells CD79A+ B Cells I CD3+CD4+ T Cells I CD3+CD8 T cells III CD79A+ B Cells II
18 Levels: CD3+CD8 T cells III CD79A+ B Cells II KIT+ Mast Cells CD3+CD4+ T Cells III CD14+CD68+ Macrophages III CD79A+ B Cells I Mixed Cells II ... CD3+CD8+ T cells I
celltypes <- c("CD14+CD68+ Macrophages I" = "CD14+CD68+ M I",
"CD14+CD68+ Macrophages II" = "CD14+CD68+ M II",
"CD14+CD68+ Macrophages III" = "CD14+CD68+ M III",
"CD3+CD8+ T cells I" = "CD3+CD8+ T I",
"CD3+CD8A+ T Cells II" = "CD3+CD8A+ T II ",
"CD3+CD8 T cells III" = "CD3+CD8 T III",
"CD3+CD4+ T Cells I" = "CD3+CD4+ T I",
"CD3+CD4+ T Cells II" = "CD3+CD4+ T II",
"CD3+CD4+ T Cells III" = "CD3+CD4+ T III",
"CD34+ Endothelial Cells I" = "CD34+ EC I",
"CD34+ Endothelial Cells II" = "CD34+ EC II",
"Mixed Cells I" = "Mixed I",
"Mixed Cells II" = "Mixed II",
"ACTA2+ Smooth Muscle Cells" = "ACTA2+ SMC",
"NCAM1+ Natural Killer Cells" = "NCAM1+ NK",
"KIT+ Mast Cells" = "KIT+ MC",
"CD79A+ B Cells I" = "CD79A+ B I",
"CD79A+ B Cells II" = "CD79A+ B II")
scRNAseqData <- Seurat::RenameIdents(object = scRNAseqData,
celltypes)
UMAPPlot(scRNAseqData, label = TRUE, pt.size = 1.25, label.size = 4, group.by = "ident",
repel = TRUE)
Loading Athero-Express clinical data.
require(haven)
# AEDB <- haven::read_sav(paste0(AEDB_loc, "/2019-3NEW_AtheroExpressDatabase_ScientificAE_02072019_IC_added.sav"))
AEDB <- haven::read_sav(paste0(AEDB_loc, "/2020_1_NEW_AtheroExpressDatabase_ScientificAE_16-03-2020.sav"))
We need to be very strict in defining symptoms. Therefore we will fix a new variable that groups symptoms at inclusion.
Coding of symptoms is as follows:
We will group as follows:
# Fix symptoms
attach(AEDB)
AEDB[,"Symptoms.5G"] <- NA
AEDB$Symptoms.5G[sympt == 0] <- "Asymptomatic"
AEDB$Symptoms.5G[sympt == 1 | sympt == 7 | sympt == 13] <- "TIA"
AEDB$Symptoms.5G[sympt == 2 | sympt == 3] <- "Stroke"
AEDB$Symptoms.5G[sympt == 4 | sympt == 14 | sympt == 15 ] <- "Ocular"
AEDB$Symptoms.5G[sympt == 8 | sympt == 11] <- "Retinal infarction"
AEDB$Symptoms.5G[sympt == 5 | sympt == 9 | sympt == 10 | sympt == 12 | sympt == 16 | sympt == 17] <- "Other"
# AsymptSympt
AEDB[,"AsymptSympt"] <- NA
AEDB$AsymptSympt[sympt == -999] <- NA
AEDB$AsymptSympt[sympt == 0] <- "Asymptomatic"
AEDB$AsymptSympt[sympt == 1 | sympt == 7 | sympt == 13 | sympt == 2 | sympt == 3] <- "Symptomatic"
AEDB$AsymptSympt[sympt == 4 | sympt == 14 | sympt == 15 | sympt == 8 | sympt == 11 | sympt == 5 | sympt == 9 | sympt == 10 | sympt == 12 | sympt == 16 | sympt == 17] <- "Ocular and others"
# AsymptSympt
AEDB[,"AsymptSympt2G"] <- NA
AEDB$AsymptSympt2G[sympt == -999] <- NA
AEDB$AsymptSympt2G[sympt == 0] <- "Asymptomatic"
AEDB$AsymptSympt2G[sympt == 1 | sympt == 7 | sympt == 13 | sympt == 2 | sympt == 3 | sympt == 4 | sympt == 14 | sympt == 15 | sympt == 8 | sympt == 11 | sympt == 5 | sympt == 9 | sympt == 10 | sympt == 12 | sympt == 16 | sympt == 17] <- "Symptomatic"
detach(AEDB)
# table(AEDB$sympt, useNA = "ifany")
# table(AEDB$AsymptSympt2G, useNA = "ifany")
# table(AEDB$Symptoms.5G, useNA = "ifany")
#
# table(AEDB$AsymptSympt2G, AEDB$sympt, useNA = "ifany")
# table(AEDB$Symptoms.5G, AEDB$sympt, useNA = "ifany")
table(AEDB$AsymptSympt2G, AEDB$Symptoms.5G, useNA = "ifany")
Asymptomatic Ocular Other Retinal infarction Stroke TIA <NA>
Asymptomatic 333 0 0 0 0 0 0
Symptomatic 0 416 119 43 732 1045 0
<NA> 0 0 0 0 0 0 1103
# AEDB.temp <- subset(AEDB, select = c("STUDY_NUMBER", "UPID", "Age", "Gender", "Hospital", "Artery_summary", "sympt", "Symptoms.5G", "AsymptSympt"))
# require(labelled)
# AEDB.temp$Gender <- to_factor(AEDB.temp$Gender)
# AEDB.temp$Hospital <- to_factor(AEDB.temp$Hospital)
# AEDB.temp$Artery_summary <- to_factor(AEDB.temp$Artery_summary)
#
# DT::datatable(AEDB.temp[1:10,], caption = "Excerpt of the whole AEDB.", rownames = FALSE)
#
# table(AEDB.temp$Symptoms.5G, AEDB.temp$AsymptSympt)
#
# rm(AEDB.temp)
We will also fix the plaquephenotypes variable.
Coding of symptoms is as follows:
# Fix plaquephenotypes
attach(AEDB)
AEDB[,"OverallPlaquePhenotype"] <- NA
AEDB$OverallPlaquePhenotype[plaquephenotype == -999] <- NA
AEDB$OverallPlaquePhenotype[plaquephenotype == -999] <- NA
AEDB$OverallPlaquePhenotype[plaquephenotype == 1] <- "fibrous"
AEDB$OverallPlaquePhenotype[plaquephenotype == 2] <- "fibroatheromatous"
AEDB$OverallPlaquePhenotype[plaquephenotype == 3] <- "atheromatous"
detach(AEDB)
# AEDB.temp <- subset(AEDB, select = c("STUDY_NUMBER", "UPID", "Age", "Gender", "Hospital", "Artery_summary", "plaquephenotype", "OverallPlaquePhenotype"))
# require(labelled)
# AEDB.temp$Gender <- to_factor(AEDB.temp$Gender)
# AEDB.temp$Hospital <- to_factor(AEDB.temp$Hospital)
# AEDB.temp$Artery_summary <- to_factor(AEDB.temp$Artery_summary)
#
# DT::datatable(AEDB.temp[1:10,], caption = "Excerpt of the whole AEDB.", rownames = FALSE)
#
# rm(AEDB.temp)
We will also fix the diabetes status variable.
# Fix diabetes
attach(AEDB)
AEDB[,"DiabetesStatus"] <- NA
AEDB$DiabetesStatus[DM.composite == -999] <- NA
AEDB$DiabetesStatus[DM.composite == 0] <- "Control (no Diabetes Dx/Med)"
AEDB$DiabetesStatus[DM.composite == 1] <- "Diabetes"
detach(AEDB)
# AEDB.temp <- subset(AEDB, select = c("STUDY_NUMBER", "UPID", "Age", "Gender", "Hospital", "Artery_summary", "DM.composite", "DiabetesStatus"))
# require(labelled)
# AEDB.temp$Gender <- to_factor(AEDB.temp$Gender)
# AEDB.temp$Hospital <- to_factor(AEDB.temp$Hospital)
# AEDB.temp$Artery_summary <- to_factor(AEDB.temp$Artery_summary)
# AEDB.temp$DiabetesStatus <- to_factor(AEDB.temp$DiabetesStatus)
#
# DT::datatable(AEDB.temp[1:10,], caption = "Excerpt of the whole AEDB.", rownames = FALSE)
#
# rm(AEDB.temp)
We will also fix the smoking status variable. We are interested in whether someone never, ever or is currently (at the time of inclusion) smoking. This is based on the questionnaire.
diet801: are you a smoker?diet802: did you smoke in the past?We already have some variables indicating smoking status:
SmokingReported: patient has reported to smoke.SmokingYearOR: smoking in the year of surgery?SmokerCurrent: currently smoking?require(labelled)
AEDB$diet801 <- to_factor(AEDB$diet801)
AEDB$diet802 <- to_factor(AEDB$diet802)
AEDB$diet805 <- to_factor(AEDB$diet805)
AEDB$SmokingReported <- to_factor(AEDB$SmokingReported)
AEDB$SmokerCurrent <- to_factor(AEDB$SmokerCurrent)
AEDB$SmokingYearOR <- to_factor(AEDB$SmokingYearOR)
# table(AEDB$diet801)
# table(AEDB$diet802)
# table(AEDB$SmokingReported)
# table(AEDB$SmokerCurrent)
# table(AEDB$SmokingYearOR)
# table(AEDB$SmokingReported, AEDB$SmokerCurrent, useNA = "ifany", dnn = c("Reported smoking", "Current smoker"))
#
# table(AEDB$diet801, AEDB$diet802, useNA = "ifany", dnn = c("Smoker", "Past smoker"))
cat("\nFixing smoking status.\n")
Fixing smoking status.
attach(AEDB)
AEDB[,"SmokerStatus"] <- NA
AEDB$SmokerStatus[diet802 == "don't know"] <- "Never smoked"
AEDB$SmokerStatus[diet802 == "I still smoke"] <- "Current smoker"
AEDB$SmokerStatus[SmokerCurrent == "no" & diet802 == "no"] <- "Never smoked"
AEDB$SmokerStatus[SmokerCurrent == "no" & diet802 == "yes"] <- "Ex-smoker"
AEDB$SmokerStatus[SmokerCurrent == "yes"] <- "Current smoker"
AEDB$SmokerStatus[SmokerCurrent == "no data available/missing"] <- NA
# AEDB$SmokerStatus[is.na(SmokerCurrent)] <- "Never smoked"
detach(AEDB)
cat("\n* Current smoking status.\n")
* Current smoking status.
table(AEDB$SmokerCurrent,
useNA = "ifany",
dnn = c("Current smoker"))
Current smoker
no data available/missing no yes <NA>
0 2364 1308 119
cat("\n* Updated smoking status.\n")
* Updated smoking status.
table(AEDB$SmokerStatus,
useNA = "ifany",
dnn = c("Updated smoking status"))
Updated smoking status
Current smoker Ex-smoker Never smoked <NA>
1308 1814 389 280
cat("\n* Comparing to 'SmokerCurrent'.\n")
* Comparing to 'SmokerCurrent'.
table(AEDB$SmokerStatus, AEDB$SmokerCurrent,
useNA = "ifany",
dnn = c("Updated smoking status", "Current smoker"))
Current smoker
Updated smoking status no data available/missing no yes <NA>
Current smoker 0 0 1308 0
Ex-smoker 0 1814 0 0
Never smoked 0 389 0 0
<NA> 0 161 0 119
# AEDB.temp <- subset(AEDB, select = c("STUDY_NUMBER", "UPID", "Age", "Gender", "Hospital", "Artery_summary", "DM.composite", "DiabetesStatus"))
# require(labelled)
# AEDB.temp$Gender <- to_factor(AEDB.temp$Gender)
# AEDB.temp$Hospital <- to_factor(AEDB.temp$Hospital)
# AEDB.temp$Artery_summary <- to_factor(AEDB.temp$Artery_summary)
# AEDB.temp$DiabetesStatus <- to_factor(AEDB.temp$DiabetesStatus)
#
# DT::datatable(AEDB.temp[1:10,], caption = "Excerpt of the whole AEDB.", rownames = FALSE)
#
# rm(AEDB.temp)
We will also fix the alcohol status variable.
# Fix diabetes
attach(AEDB)
AEDB[,"AlcoholUse"] <- NA
AEDB$AlcoholUse[diet810 == -999] <- NA
AEDB$AlcoholUse[diet810 == 0] <- "No"
AEDB$AlcoholUse[diet810 == 1] <- "Yes"
detach(AEDB)
# AEDB.temp <- subset(AEDB, select = c("STUDY_NUMBER", "UPID", "Age", "Gender", "Hospital", "Artery_summary", "diet810", "AlcoholUse"))
# require(labelled)
# AEDB.temp$Gender <- to_factor(AEDB.temp$Gender)
# AEDB.temp$Hospital <- to_factor(AEDB.temp$Hospital)
# AEDB.temp$Artery_summary <- to_factor(AEDB.temp$Artery_summary)
# AEDB.temp$AlcoholUse <- to_factor(AEDB.temp$AlcoholUse)
#
# DT::datatable(AEDB.temp[1:10,], caption = "Excerpt of the whole AEDB.", rownames = FALSE)
#
# rm(AEDB.temp)
We are interested in the following variables at baseline.
cat("====================================================================================================\n")
====================================================================================================
cat("SELECTION THE SHIZZLE\n")
SELECTION THE SHIZZLE
### Artery levels
# AEdata$Artery_summary:
# value label
# NOT USE - 0 No artery known (yet), no surgery (patient ill, died, exited study), re-numbered to AAA
# USE - 1 carotid (left & right)
# USE - 2 femoral/iliac (left, right or both sides)
# NOT USE - 3 other carotid arteries (common, external)
# NOT USE - 4 carotid bypass and injury (left, right or both sides)
# NOT USE - 5 aneurysmata (carotid & femoral)
# NOT USE - 6 aorta
# NOT USE - 7 other arteries (renal, popliteal, vertebral)
# NOT USE - 8 femoral bypass, angioseal and injury (left, right or both sides)
### AEdata$informedconsent
# value label
# NOT USE - -999 missing
# NOT USE - 0 no, died
# USE - 1 yes
# USE - 2 yes, health treatment when possible
# USE - 3 yes, no health treatment
# USE - 4 yes, no health treatment, no commercial business
# NOT USE - 5 yes, no tissue, no commerical business
# NOT USE - 6 yes, no tissue, no questionnaires, no medical info, no commercial business
# USE - 7 yes, no questionnaires, no health treatment, no commercial business
# USE - 8 yes, no questionnaires, health treatment when possible
# NOT USE - 9 yes, no tissue, no questionnaires, no health treatment, no commerical business
# USE - 10 yes, no health treatment, no medical info, no commercial business
# NOT USE - 11 yes, no tissue, no questionnaires, no health treatment, no medical info, no commercial business
# USE - 12 yes, no questionnaires, no health treatment
# NOT USE - 13 yes, no tissue, no health treatment
# NOT USE - 14 yes, no tissue, no questionnaires
# NOT USE - 15 yes, no tissue, health treatment when possible
# NOT USE - 16 yes, no tissue
# USE - 17 yes, no commerical business
# USE - 18 yes, health treatment when possible, no commercial business
# USE - 19 yes, no medical info, no commercial business
# USE - 20 yes, no questionnaires
# NOT USE - 21 yes, no tissue, no questionnaires, no health treatment, no medical info
# NOT USE - 22 yes, no tissue, no questionnaires, no health treatment, no commercial business
# USE - 23 yes, no medical info
# USE - 24 yes, no questionnaires, no commercial business
# USE - 25 yes, no questionnaires, no health treatment, no medical info
# USE - 26 yes, no questionnaires, health treatment when possible, no commercial business
# USE - 27 yes, no health treatment, no medical info
# NOT USE - 28 no, doesn't want to
# NOT USE - 29 no, unable to sign
# NOT USE - 30 no, no reaction
# NOT USE - 31 no, lost
# NOT USE - 32 no, too old
# NOT USE - 34 yes, no medical info, health treatment when possible
# NOT USE - 35 no (never asked for IC because there was no tissue)
# USE - 36 yes, no medical info, no commercial business, health treatment when possible
# NOT USE - 37 no, endpoint
# USE - 38 wil niets invullen, wel alles gebruiken
# USE - 39 second informed concents: yes, no commercial business
# NOT USE - 40 nooit geincludeerd
cat("- sanity checking PRIOR to selection")
- sanity checking PRIOR to selection
library(data.table)
require(labelled)
ae.gender <- to_factor(AEDB$Gender)
ae.hospital <- to_factor(AEDB$Hospital)
table(ae.gender, ae.hospital, dnn = c("Sex", "Hospital"))
Hospital
Sex St. Antonius, Nieuwegein UMC Utrecht
female 524 636
male 1211 1420
ae.artery <- to_factor(AEDB$Artery_summary)
table(ae.artery, ae.gender, dnn = c("Sex", "Artery"))
Artery
Sex female male
No artery known (yet), no surgery (patient ill, died, exited study), re-numbered to AAA 0 0
carotid (left & right) 805 1781
femoral/iliac (left, right or both sides) 320 796
other carotid arteries (common, external) 17 35
carotid bypass and injury (left, right or both sides) 6 3
aneurysmata (carotid & femoral) 1 0
aorta 3 5
other arteries (renal, popliteal, vertebral) 4 9
femoral bypass, angioseal and injury (left, right or both sides) 4 2
rm(ae.gender, ae.hospital, ae.artery)
# I change numeric and factors manually because, well, I wouldn't know how to fix it otherwise
# to have this 'tibble' work with 'tableone'... :-)
AEDB$Age <- as.numeric(AEDB$Age)
AEDB$diastoli <- as.numeric(AEDB$diastoli)
AEDB$systolic <- as.numeric(AEDB$systolic)
AEDB$TC_finalCU <- as.numeric(AEDB$TC_finalCU)
AEDB$LDL_finalCU <- as.numeric(AEDB$LDL_finalCU)
AEDB$HDL_finalCU <- as.numeric(AEDB$HDL_finalCU)
AEDB$TG_finalCU <- as.numeric(AEDB$TG_finalCU)
AEDB$TC_final <- as.numeric(AEDB$TC_final)
AEDB$LDL_final <- as.numeric(AEDB$LDL_final)
AEDB$HDL_final <- as.numeric(AEDB$HDL_final)
AEDB$TG_final <- as.numeric(AEDB$TG_final)
AEDB$Age <- as.numeric(AEDB$Age)
AEDB$GFR_MDRD <- as.numeric(AEDB$GFR_MDRD)
AEDB$BMI <- as.numeric(AEDB$BMI)
AEDB$eCigarettes <- as.numeric(AEDB$eCigarettes)
AEDB$ePackYearsSmoking <- as.numeric(AEDB$ePackYearsSmoking)
AEDB$EP_composite_time <- as.numeric(AEDB$EP_composite_time)
AEDB$macmean0 <- as.numeric(AEDB$macmean0)
AEDB$smcmean0 <- as.numeric(AEDB$smcmean0)
AEDB$neutrophils <- as.numeric(AEDB$neutrophils)
AEDB$Mast_cells_plaque <- as.numeric(AEDB$Mast_cells_plaque)
AEDB$vessel_density_averaged <- as.numeric(AEDB$vessel_density_averaged)
AEDB$IL6 <- as.numeric(AEDB$IL6)
AEDB$IL6_pg_ug_2015 <- as.numeric(AEDB$IL6_pg_ug_2015)
AEDB$IL6R_pg_ug_2015 <- as.numeric(AEDB$IL6R_pg_ug_2015)
AEDB$MCP1 <- as.numeric(AEDB$MCP1)
AEDB$MCP1_pg_ug_2015 <- as.numeric(AEDB$MCP1_pg_ug_2015)
AEDB$hsCRP_plasma <- as.numeric(AEDB$hsCRP_plasma)
require(labelled)
AEDB$ORyear <- to_factor(AEDB$ORyear)
AEDB$Gender <- to_factor(AEDB$Gender)
AEDB$Hospital <- to_factor(AEDB$Hospital)
AEDB$KDOQI <- to_factor(AEDB$KDOQI)
AEDB$BMI_WHO <- to_factor(AEDB$BMI_WHO)
AEDB$DiabetesStatus <- to_factor(AEDB$DiabetesStatus)
AEDB$SmokerStatus <- to_factor(AEDB$SmokerStatus)
AEDB$AlcoholUse <- to_factor(AEDB$AlcoholUse)
AEDB$Hypertension.selfreport <- to_factor(AEDB$Hypertension1)
AEDB$Hypertension.selfreportdrug <- to_factor(AEDB$Hypertension2)
AEDB$Hypertension.composite <- to_factor(AEDB$Hypertension.composite)
AEDB$Hypertension.drugs <- to_factor(AEDB$Hypertension.drugs)
AEDB$Med.anticoagulants <- to_factor(AEDB$Med.anticoagulants)
AEDB$Med.all.antiplatelet <- to_factor(AEDB$Med.all.antiplatelet)
AEDB$Med.Statin.LLD <- to_factor(AEDB$Med.Statin.LLD)
AEDB$Stroke_Dx <- to_factor(AEDB$Stroke_Dx)
AEDB$CAD_history <- to_factor(AEDB$CAD_history)
AEDB$PAOD <- to_factor(AEDB$PAOD)
AEDB$Peripheral.interv <- to_factor(AEDB$Peripheral.interv)
AEDB$sympt <- to_factor(AEDB$sympt)
AEDB$Symptoms.3g <- to_factor(AEDB$Symptoms.3g)
AEDB$Symptoms.4g <- to_factor(AEDB$Symptoms.4g)
AEDB$Symptoms.5G <- to_factor(AEDB$Symptoms.5G)
AEDB$AsymptSympt <- to_factor(AEDB$AsymptSympt)
AEDB$AsymptSympt2G <- to_factor(AEDB$AsymptSympt2G)
AEDB$restenos <- to_factor(AEDB$restenos)
AEDB$stenose <- to_factor(AEDB$stenose)
AEDB$EP_composite <- to_factor(AEDB$EP_composite)
AEDB$Macrophages.bin <- to_factor(AEDB$Macrophages.bin)
AEDB$SMC.bin <- to_factor(AEDB$SMC.bin)
AEDB$IPH.bin <- to_factor(AEDB$IPH.bin)
AEDB$Calc.bin <- to_factor(AEDB$Calc.bin)
AEDB$Collagen.bin <- to_factor(AEDB$Collagen.bin)
AEDB$Fat.bin_10 <- to_factor(AEDB$Fat.bin_10)
AEDB$Fat.bin_40 <- to_factor(AEDB$Fat.bin_40)
AEDB$OverallPlaquePhenotype <- to_factor(AEDB$OverallPlaquePhenotype)
AEDB$Artery_summary <- to_factor(AEDB$Artery_summary)
AEDB$informedconsent <- to_factor(AEDB$informedconsent)
AEDB.CEA <- subset(AEDB,
(Artery_summary == "carotid (left & right)" | Artery_summary == "other carotid arteries (common, external)") & # we only want carotids
informedconsent != "missing" & # we are really strict in selecting based on 'informed consent'!
informedconsent != "no, died" &
informedconsent != "yes, no tissue, no commerical business" &
informedconsent != "yes, no tissue, no questionnaires, no medical info, no commercial business" &
informedconsent != "yes, no tissue, no questionnaires, no health treatment, no commerical business" &
informedconsent != "yes, no tissue, no questionnaires, no health treatment, no medical info, no commercial business" &
informedconsent != "yes, no tissue, no health treatment" &
informedconsent != "yes, no tissue, no questionnaires" &
informedconsent != "yes, no tissue, health treatment when possible" &
informedconsent != "yes, no tissue" &
informedconsent != "yes, no tissue, no questionnaires, no health treatment, no medical info" &
informedconsent != "yes, no tissue, no questionnaires, no health treatment, no commercial business" &
informedconsent != "no, doesn't want to" &
informedconsent != "no, unable to sign" &
informedconsent != "no, no reaction" &
informedconsent != "no, lost" &
informedconsent != "no, too old" &
informedconsent != "yes, no medical info, health treatment when possible" &
informedconsent != "no (never asked for IC because there was no tissue)" &
informedconsent != "no, endpoint" &
informedconsent != "nooit geincludeerd" &
!is.na(AsymptSympt2G))
AEDB.CEA[1:10, 1:10]
dim(AEDB.CEA)
[1] 2421 1100
cat("===========================================================================================\n")
===========================================================================================
cat("CREATE BASELINE TABLE\n")
CREATE BASELINE TABLE
# Baseline table variables
basetable_vars = c("Hospital", "ORyear",
"Age", "Gender",
"TC_finalCU", "LDL_finalCU", "HDL_finalCU", "TG_finalCU",
"TC_final", "LDL_final", "HDL_final", "TG_final",
"hsCRP_plasma",
"systolic", "diastoli", "GFR_MDRD", "BMI",
"KDOQI", "BMI_WHO",
"SmokerStatus", "AlcoholUse",
"DiabetesStatus",
"Hypertension.selfreport", "Hypertension.selfreportdrug", "Hypertension.composite", "Hypertension.drugs",
"Med.anticoagulants", "Med.all.antiplatelet", "Med.Statin.LLD",
"Stroke_Dx", "sympt", "Symptoms.5G", "AsymptSympt", "AsymptSympt2G",
"restenos", "stenose",
"CAD_history", "PAOD", "Peripheral.interv",
"EP_composite", "EP_composite_time",
"macmean0", "smcmean0", "Macrophages.bin", "SMC.bin",
"neutrophils", "Mast_cells_plaque",
"IPH.bin", "vessel_density_averaged",
"Calc.bin", "Collagen.bin",
"Fat.bin_10", "Fat.bin_40", "OverallPlaquePhenotype",
"IL6", "IL6_pg_ug_2015", "IL6R_pg_ug_2015",
"MCP1", "MCP1_pg_ug_2015")
basetable_bin = c("Gender",
"KDOQI", "BMI_WHO",
"SmokerStatus", "AlcoholUse",
"DiabetesStatus",
"Hypertension.selfreport", "Hypertension.selfreportdrug", "Hypertension.composite", "Hypertension.drugs",
"Med.anticoagulants", "Med.all.antiplatelet", "Med.Statin.LLD",
"Stroke_Dx", "sympt", "Symptoms.5G", "AsymptSympt", "AsymptSympt2G",
"restenos", "stenose",
"CAD_history", "PAOD", "Peripheral.interv",
"EP_composite", "Macrophages.bin", "SMC.bin",
"IPH.bin",
"Calc.bin", "Collagen.bin",
"Fat.bin_10", "Fat.bin_40", "OverallPlaquePhenotype")
# basetable_bin
basetable_con = basetable_vars[!basetable_vars %in% basetable_bin]
# basetable_con
Showing the baseline table of the whole Athero-Express Biobank.
# Create baseline tables
# http://rstudio-pubs-static.s3.amazonaws.com/13321_da314633db924dc78986a850813a50d5.html
AEDB.CEA.tableOne = print(CreateTableOne(vars = basetable_vars,
# factorVars = basetable_bin,
strata = "AsymptSympt2G",
data = AEDB.CEA, includeNA = TRUE),
nonnormal = c(), missing = TRUE,
quote = FALSE, noSpaces = FALSE, showAllLevels = TRUE, explain = TRUE,
format = "pf",
contDigits = 3)[,1:6]
Stratified by AsymptSympt2G
level Asymptomatic Symptomatic p test Missing
n 270 2151
Hospital % (freq) St. Antonius, Nieuwegein 43.3 (117) 38.6 ( 831) 0.154 0.0
UMC Utrecht 56.7 (153) 61.4 (1320)
ORyear % (freq) No data available/missing 0.0 ( 0) 0.0 ( 0) NaN 0.0
2002 8.1 ( 22) 2.7 ( 59)
2003 8.9 ( 24) 6.2 ( 133)
2004 12.2 ( 33) 7.3 ( 157)
2005 8.5 ( 23) 7.5 ( 162)
2006 6.7 ( 18) 7.7 ( 165)
2007 6.7 ( 18) 6.2 ( 134)
2008 5.2 ( 14) 5.8 ( 124)
2009 7.0 ( 19) 7.5 ( 162)
2010 5.9 ( 16) 6.6 ( 143)
2011 4.8 ( 13) 7.0 ( 150)
2012 5.2 ( 14) 7.5 ( 162)
2013 3.7 ( 10) 6.5 ( 139)
2014 7.4 ( 20) 6.6 ( 143)
2015 1.9 ( 5) 3.3 ( 71)
2016 0.7 ( 2) 3.9 ( 83)
2017 4.1 ( 11) 2.5 ( 54)
2018 1.5 ( 4) 2.9 ( 62)
2019 1.5 ( 4) 2.2 ( 48)
Age (mean (SD)) 66.481 (8.695) 69.434 (9.326) <0.001 0.0
Gender % (freq) female 23.7 ( 64) 31.3 ( 674) 0.013 0.0
male 76.3 (206) 68.7 (1477)
TC_finalCU (mean (SD)) 180.295 (46.689) 185.299 (57.212) 0.303 38.0
LDL_finalCU (mean (SD)) 103.501 (38.426) 108.963 (42.075) 0.155 45.6
HDL_finalCU (mean (SD)) 45.431 (14.817) 46.549 (17.235) 0.456 41.7
TG_finalCU (mean (SD)) 162.426 (90.058) 149.914 (91.364) 0.119 42.8
TC_final (mean (SD)) 4.670 (1.209) 4.799 (1.482) 0.303 38.0
LDL_final (mean (SD)) 2.681 (0.995) 2.822 (1.090) 0.155 45.6
HDL_final (mean (SD)) 1.177 (0.384) 1.206 (0.446) 0.456 41.7
TG_final (mean (SD)) 1.835 (1.018) 1.694 (1.032) 0.119 42.8
hsCRP_plasma (mean (SD)) 7.121 (24.154) 21.751 (247.521) 0.480 53.0
systolic (mean (SD)) 150.017 (23.846) 152.711 (25.312) 0.123 11.3
diastoli (mean (SD)) 79.618 (12.216) 81.525 (26.329) 0.275 11.3
GFR_MDRD (mean (SD)) 72.629 (21.381) 73.182 (21.128) 0.697 5.4
BMI (mean (SD)) 26.712 (3.677) 26.460 (4.014) 0.336 5.9
KDOQI % (freq) No data available/missing 0.0 ( 0) 0.0 ( 0) NaN 5.5
Normal kidney function 18.9 ( 51) 19.1 ( 411)
CKD 2 (Mild) 47.0 (127) 51.4 (1105)
CKD 3 (Moderate) 25.6 ( 69) 22.5 ( 484)
CKD 4 (Severe) 0.0 ( 0) 1.5 ( 32)
CKD 5 (Failure) 0.7 ( 2) 0.4 ( 8)
<NA> 7.8 ( 21) 5.2 ( 111)
BMI_WHO % (freq) No data available/missing 0.0 ( 0) 0.0 ( 0) NaN 5.9
Underweight 0.7 ( 2) 1.0 ( 22)
Normal 31.1 ( 84) 35.6 ( 766)
Overweight 49.6 (134) 42.6 ( 917)
Obese 14.4 ( 39) 14.6 ( 313)
<NA> 4.1 ( 11) 6.2 ( 133)
SmokerStatus % (freq) Current smoker 27.0 ( 73) 33.9 ( 730) 0.024 5.9
Ex-smoker 56.3 (152) 47.0 (1011)
Never smoked 12.6 ( 34) 13.0 ( 279)
<NA> 4.1 ( 11) 6.1 ( 131)
AlcoholUse % (freq) No 34.8 ( 94) 34.4 ( 741) 0.631 4.0
Yes 62.2 (168) 61.4 (1320)
<NA> 3.0 ( 8) 4.2 ( 90)
DiabetesStatus % (freq) Control (no Diabetes Dx/Med) 76.3 (206) 75.0 (1614) 0.777 1.1
Diabetes 23.0 ( 62) 23.8 ( 512)
<NA> 0.7 ( 2) 1.2 ( 25)
Hypertension.selfreport % (freq) No data available/missing 0.0 ( 0) 0.0 ( 0) NaN 3.2
no 23.7 ( 64) 24.4 ( 525)
yes 74.1 (200) 72.2 (1554)
<NA> 2.2 ( 6) 3.3 ( 72)
Hypertension.selfreportdrug % (freq) No data available/missing 0.0 ( 0) 0.0 ( 0) NaN 4.4
no 30.0 ( 81) 29.9 ( 644)
yes 65.2 (176) 65.7 (1413)
<NA> 4.8 ( 13) 4.4 ( 94)
Hypertension.composite % (freq) No data available/missing 0.0 ( 0) 0.0 ( 0) NaN 1.2
no 11.5 ( 31) 15.0 ( 322)
yes 87.8 (237) 83.8 (1803)
<NA> 0.7 ( 2) 1.2 ( 26)
Hypertension.drugs % (freq) No data available/missing 0.0 ( 0) 0.0 ( 0) NaN 1.4
no 17.8 ( 48) 24.0 ( 517)
yes 81.1 (219) 74.6 (1604)
<NA> 1.1 ( 3) 1.4 ( 30)
Med.anticoagulants % (freq) No data available/missing 0.0 ( 0) 0.0 ( 0) NaN 1.6
no 88.5 (239) 87.2 (1875)
yes 10.4 ( 28) 11.2 ( 241)
<NA> 1.1 ( 3) 1.6 ( 35)
Med.all.antiplatelet % (freq) No data available/missing 0.0 ( 0) 0.0 ( 0) NaN 1.5
no 7.4 ( 20) 12.8 ( 275)
yes 91.1 (246) 85.7 (1844)
<NA> 1.5 ( 4) 1.5 ( 32)
Med.Statin.LLD % (freq) No data available/missing 0.0 ( 0) 0.0 ( 0) NaN 1.4
no 16.7 ( 45) 20.7 ( 446)
yes 82.2 (222) 77.8 (1674)
<NA> 1.1 ( 3) 1.4 ( 31)
Stroke_Dx % (freq) Missing 0.0 ( 0) 0.0 ( 0) NaN 6.9
No stroke diagnosed 80.7 (218) 70.3 (1513)
Stroke diagnosed 12.6 ( 34) 22.8 ( 490)
<NA> 6.7 ( 18) 6.9 ( 148)
sympt % (freq) missing 0.0 ( 0) 0.0 ( 0) NaN 0.0
Asymptomatic 100.0 (270) 0.0 ( 0)
TIA 0.0 ( 0) 44.7 ( 961)
minor stroke 0.0 ( 0) 18.9 ( 407)
Major stroke 0.0 ( 0) 11.1 ( 238)
Amaurosis fugax 0.0 ( 0) 17.6 ( 379)
Four vessel disease 0.0 ( 0) 1.8 ( 38)
Vertebrobasilary TIA 0.0 ( 0) 0.2 ( 5)
Retinal infarction 0.0 ( 0) 1.6 ( 34)
Symptomatic, but aspecific symtoms 0.0 ( 0) 2.5 ( 53)
Contralateral symptomatic occlusion 0.0 ( 0) 0.5 ( 11)
retinal infarction 0.0 ( 0) 0.3 ( 6)
armclaudication due to occlusion subclavian artery, CEA needed for bypass 0.0 ( 0) 0.0 ( 1)
retinal infarction + TIAs 0.0 ( 0) 0.0 ( 0)
Ocular ischemic syndrome 0.0 ( 0) 0.7 ( 16)
ischemisch glaucoom 0.0 ( 0) 0.0 ( 0)
subclavian steal syndrome 0.0 ( 0) 0.1 ( 2)
TGA 0.0 ( 0) 0.0 ( 0)
Symptoms.5G % (freq) Asymptomatic 100.0 (270) 0.0 ( 0) <0.001 0.0
Ocular 0.0 ( 0) 18.4 ( 395)
Other 0.0 ( 0) 4.9 ( 105)
Retinal infarction 0.0 ( 0) 1.9 ( 40)
Stroke 0.0 ( 0) 30.0 ( 645)
TIA 0.0 ( 0) 44.9 ( 966)
AsymptSympt % (freq) Asymptomatic 100.0 (270) 0.0 ( 0) <0.001 0.0
Ocular and others 0.0 ( 0) 25.1 ( 540)
Symptomatic 0.0 ( 0) 74.9 (1611)
AsymptSympt2G % (freq) Asymptomatic 100.0 (270) 0.0 ( 0) <0.001 0.0
Symptomatic 0.0 ( 0) 100.0 (2151)
restenos % (freq) missing 0.0 ( 0) 0.0 ( 0) NaN 1.4
de novo 91.9 (248) 93.9 (2020)
restenosis 6.3 ( 17) 4.7 ( 101)
stenose bij angioseal na PTCA 0.0 ( 0) 0.0 ( 0)
<NA> 1.9 ( 5) 1.4 ( 30)
stenose % (freq) missing 0.0 ( 0) 0.0 ( 0) NaN 2.0
0-49% 0.0 ( 0) 0.6 ( 13)
50-70% 2.6 ( 7) 8.5 ( 182)
70-90% 51.1 (138) 46.0 ( 989)
90-99% 40.7 (110) 38.0 ( 817)
100% (Occlusion) 0.7 ( 2) 1.3 ( 29)
NA 0.0 ( 0) 0.0 ( 1)
50-99% 0.4 ( 1) 0.7 ( 14)
70-99% 1.9 ( 5) 2.9 ( 63)
99 0.0 ( 0) 0.1 ( 2)
<NA> 2.6 ( 7) 1.9 ( 41)
CAD_history % (freq) Missing 0.0 ( 0) 0.0 ( 0) NaN 1.9
No history CAD 57.8 (156) 68.0 (1462)
History CAD 41.1 (111) 30.0 ( 645)
<NA> 1.1 ( 3) 2.0 ( 44)
PAOD % (freq) missing/no data 0.0 ( 0) 0.0 ( 0) NaN 2.0
no 71.9 (194) 78.2 (1682)
yes 27.0 ( 73) 19.7 ( 424)
<NA> 1.1 ( 3) 2.1 ( 45)
Peripheral.interv % (freq) no 66.3 (179) 78.5 (1689) <0.001 2.9
yes 31.5 ( 85) 18.5 ( 397)
<NA> 2.2 ( 6) 3.0 ( 65)
EP_composite % (freq) No data available. 0.0 ( 0) 0.0 ( 0) NaN 5.0
No composite endpoints 66.3 (179) 71.1 (1530)
Composite endpoints 30.7 ( 83) 23.6 ( 507)
<NA> 3.0 ( 8) 5.3 ( 114)
EP_composite_time (mean (SD)) 2.444 (1.032) 2.483 (1.119) 0.590 5.2
macmean0 (mean (SD)) 0.710 (0.999) 0.775 (1.207) 0.458 29.7
smcmean0 (mean (SD)) 2.395 (2.486) 1.928 (2.362) 0.009 29.9
Macrophages.bin % (freq) no/minor 40.0 (108) 34.3 ( 738) 0.159 24.1
moderate/heavy 38.9 (105) 41.2 ( 886)
<NA> 21.1 ( 57) 24.5 ( 527)
SMC.bin % (freq) no/minor 18.9 ( 51) 25.6 ( 551) 0.004 23.8
moderate/heavy 60.7 (164) 50.1 (1078)
<NA> 20.4 ( 55) 24.3 ( 522)
neutrophils (mean (SD)) 120.955 (407.051) 151.585 (422.754) 0.655 87.4
Mast_cells_plaque (mean (SD)) 133.073 (145.270) 170.896 (166.898) 0.178 90.0
IPH.bin % (freq) no 36.7 ( 99) 30.0 ( 645) 0.065 23.5
yes 43.3 (117) 46.1 ( 991)
[ reached getOption("max.print") -- omitted 23 rows ]
metadata <- scRNAseqData@meta.data %>% as_tibble()
scRNAseqDataMeta <- metadata %>% distinct(Patient, .keep_all = TRUE)
distinct: removed 6,154 rows (99%), 37 rows remaining
scRNAseqDataMetaAE <- merge(scRNAseqDataMeta, AEDB, by.x = "Patient", by.y = "STUDY_NUMBER", sort = FALSE, all.x = TRUE)
dim(scRNAseqDataMetaAE)
[1] 37 1123
# Replace missing data
# Ref: https://cran.r-project.org/web/packages/naniar/vignettes/replace-with-na.html
require(naniar)
na_strings <- c("NA", "N A", "N / A", "N/A", "N/ A",
"Not Available", "Not available",
"missing",
"-999", "-99",
"No data available/missing", "No data available/Missing")
# Then you write ~.x %in% na_strings - which reads as “does this value occur in the list of NA strings”.
scRNAseqDataMetaAE %>%
replace_with_na_all(condition = ~.x %in% na_strings)
cat("====================================================================================================")
====================================================================================================
cat("SELECTION THE SHIZZLE")
SELECTION THE SHIZZLE
cat("- sanity checking PRIOR to selection")
- sanity checking PRIOR to selection
library(data.table)
require(labelled)
ae.gender <- to_factor(scRNAseqDataMetaAE$Gender)
ae.hospital <- to_factor(scRNAseqDataMetaAE$Hospital)
table(ae.gender, ae.hospital, dnn = c("Sex", "Hospital"), useNA = "ifany")
Hospital
Sex St. Antonius, Nieuwegein UMC Utrecht <NA>
female 0 10 0
male 0 26 0
<NA> 0 0 1
ae.artery <- to_factor(scRNAseqDataMetaAE$Artery_summary)
table(ae.artery, ae.gender, dnn = c("Sex", "Artery"), useNA = "ifany")
Artery
Sex female male <NA>
No artery known (yet), no surgery (patient ill, died, exited study), re-numbered to AAA 0 0 0
carotid (left & right) 10 25 0
femoral/iliac (left, right or both sides) 0 0 0
other carotid arteries (common, external) 0 1 0
carotid bypass and injury (left, right or both sides) 0 0 0
aneurysmata (carotid & femoral) 0 0 0
aorta 0 0 0
other arteries (renal, popliteal, vertebral) 0 0 0
femoral bypass, angioseal and injury (left, right or both sides) 0 0 0
<NA> 0 0 1
ae.ic <- to_factor(scRNAseqDataMetaAE$informedconsent)
table(ae.ic, ae.gender, useNA = "ifany")
ae.gender
ae.ic female male <NA>
missing 0 0 0
no, died 0 0 0
yes 5 14 0
yes, health treatment when possible 2 7 0
yes, no health treatment 1 2 0
yes, no health treatment, no commercial business 1 2 0
yes, no tissue, no commerical business 0 0 0
yes, no tissue, no questionnaires, no medical info, no commercial business 0 0 0
yes, no questionnaires, no health treatment, no commercial business 0 0 0
yes, no questionnaires, health treatment when possible 0 0 0
yes, no tissue, no questionnaires, no health treatment, no commerical business 0 0 0
yes, no health treatment, no medical info, no commercial business 0 0 0
yes, no tissue, no questionnaires, no health treatment, no medical info, no commercial business 0 0 0
yes, no questionnaires, no health treatment 0 0 0
yes, no tissue, no health treatment 0 0 0
yes, no tissue, no questionnaires 0 0 0
yes, no tissue, health treatment when possible 0 0 0
yes, no tissue 0 0 0
yes, no commerical business 0 1 0
yes, health treatment when possible, no commercial business 0 0 0
yes, no medical info, no commercial business 0 0 0
yes, no questionnaires 0 0 0
yes, no tissue, no questionnaires, no health treatment, no medical info 0 0 0
yes, no tissue, no questionnaires, no health treatment, no commercial business 0 0 0
yes, no medical info 0 0 0
yes, no questionnaires, no commercial business 0 0 0
yes, no questionnaires, no health treatment, no medical info 0 0 0
yes, no questionnaires, health treatment when possible, no commercial business 0 0 0
yes, no health treatment, no medical info 0 0 0
no, doesn't want to 0 0 0
no, unable to sign 0 0 0
no, no reaction 0 0 0
no, lost 0 0 0
no, too old 0 0 0
yes, no medical info, health treatment when possible 1 0 0
no (never asked for IC because there was no tissue) 0 0 0
yes, no medical info, no commercial business, health treatment when possible 0 0 0
no, endpoint 0 0 0
wil niets invullen, wel alles gebruiken 0 0 0
second informed concents: yes, no commercial business 0 0 0
nooit geincludeerd 0 0 0
<NA> 0 0 1
rm(ae.gender, ae.hospital, ae.artery, ae.ic)
scRNAseqDataMetaAE.all <- subset(scRNAseqDataMetaAE,
(Artery_summary == "carotid (left & right)" | Artery_summary == "other carotid arteries (common, external)" ) & # we only want carotids
informedconsent != "missing" & # we are really strict in selecting based on 'informed consent'!
informedconsent != "no, died" &
informedconsent != "yes, no tissue, no commerical business" &
informedconsent != "yes, no tissue, no questionnaires, no medical info, no commercial business" &
informedconsent != "yes, no tissue, no questionnaires, no health treatment, no commerical business" &
informedconsent != "yes, no tissue, no questionnaires, no health treatment, no medical info, no commercial business" &
informedconsent != "yes, no tissue, no health treatment" &
informedconsent != "yes, no tissue, no questionnaires" &
informedconsent != "yes, no tissue, health treatment when possible" &
informedconsent != "yes, no tissue" &
informedconsent != "yes, no tissue, no questionnaires, no health treatment, no medical info" &
informedconsent != "yes, no tissue, no questionnaires, no health treatment, no commercial business" &
informedconsent != "no, doesn't want to" &
informedconsent != "no, unable to sign" &
informedconsent != "no, no reaction" &
informedconsent != "no, lost" &
informedconsent != "no, too old" &
informedconsent != "yes, no medical info, health treatment when possible" &
informedconsent != "no (never asked for IC because there was no tissue)" &
informedconsent != "no, endpoint" &
informedconsent != "nooit geincludeerd")
# scRNAseqDataMetaAE.all[1:10, 1:10]
dim(scRNAseqDataMetaAE.all)
[1] 35 1123
# DT::datatable(scRNAseqDataMetaAE.all)
Showing the baseline table.
cat("===========================================================================================")
===========================================================================================
cat("CREATE BASELINE TABLE")
CREATE BASELINE TABLE
# Create baseline tables
# http://rstudio-pubs-static.s3.amazonaws.com/13321_da314633db924dc78986a850813a50d5.html
scRNAseqDataMetaAE.all.tableOne = print(CreateTableOne(vars = basetable_vars,
# factorVars = basetable_bin,
strata = "AsymptSympt2G",
data = scRNAseqDataMetaAE.all, includeNA = TRUE),
nonnormal = c(),
quote = FALSE, showAllLevels = TRUE,
format = "p",
contDigits = 3)[,1:2]
These variables only have NA/NaN: hsCRP_plasma macmean0 smcmean0 Macrophages.bin SMC.bin neutrophils Mast_cells_plaque IPH.bin vessel_density_averaged Calc.bin Collagen.bin Fat.bin_10 Fat.bin_40 OverallPlaquePhenotype IL6 IL6_pg_ug_2015 IL6R_pg_ug_2015 MCP1 MCP1_pg_ug_2015 Dropped
Stratified by AsymptSympt2G
level Asymptomatic Symptomatic p test
n 6 29
Hospital (%) St. Antonius, Nieuwegein 0.0 0.0 NaN
UMC Utrecht 100.0 100.0
ORyear (%) No data available/missing 0.0 0.0 NaN
2002 0.0 0.0
2003 0.0 0.0
2004 0.0 0.0
2005 0.0 0.0
2006 0.0 0.0
2007 0.0 0.0
2008 0.0 0.0
2009 0.0 0.0
2010 0.0 0.0
2011 0.0 0.0
2012 0.0 0.0
2013 0.0 0.0
2014 0.0 0.0
2015 0.0 0.0
2016 0.0 0.0
2017 0.0 0.0
2018 66.7 65.5
2019 33.3 34.5
Age (mean (SD)) 68.333 (5.354) 73.655 (8.427) 0.149
Gender (%) female 0.0 31.0 0.285
male 100.0 69.0
TC_finalCU (mean (SD)) 148.005 (52.704) 170.849 (47.727) 0.446
LDL_finalCU (mean (SD)) 59.202 (5.898) 101.897 (37.879) 0.067
HDL_finalCU (mean (SD)) 37.066 (7.837) 44.788 (10.236) 0.223
TG_finalCU (mean (SD)) 260.767 (195.784) 159.292 (94.171) 0.137
TC_final (mean (SD)) 3.833 (1.365) 4.425 (1.236) 0.446
LDL_final (mean (SD)) 1.533 (0.153) 2.639 (0.981) 0.067
HDL_final (mean (SD)) 0.960 (0.203) 1.160 (0.265) 0.223
TG_final (mean (SD)) 2.947 (2.212) 1.800 (1.064) 0.137
systolic (mean (SD)) 150.800 (25.821) 153.690 (26.233) 0.821
diastoli (mean (SD)) 83.200 (5.762) 80.414 (17.340) 0.727
GFR_MDRD (mean (SD)) 66.437 (28.718) 85.083 (31.324) 0.333
BMI (mean (SD)) 26.412 (2.710) 26.492 (3.620) 0.960
KDOQI (%) No data available/missing 0.0 0.0 NaN
Normal kidney function 16.7 37.9
CKD 2 (Mild) 0.0 37.9
CKD 3 (Moderate) 33.3 20.7
CKD 4 (Severe) 0.0 0.0
CKD 5 (Failure) 0.0 0.0
<NA> 50.0 3.4
BMI_WHO (%) No data available/missing 0.0 0.0 NaN
Underweight 0.0 3.4
Normal 33.3 31.0
Overweight 50.0 41.4
Obese 16.7 13.8
<NA> 0.0 10.3
SmokerStatus (%) Current smoker 16.7 37.9 0.296
Ex-smoker 83.3 41.4
Never smoked 0.0 17.2
<NA> 0.0 3.4
AlcoholUse (%) No 0.0 44.8 0.066
Yes 100.0 48.3
<NA> 0.0 6.9
DiabetesStatus (%) Control (no Diabetes Dx/Med) 83.3 58.6 0.499
Diabetes 16.7 41.4
Hypertension.selfreport (%) No data available/missing 0.0 0.0 NaN
no 0.0 13.8
yes 100.0 82.8
<NA> 0.0 3.4
Hypertension.selfreportdrug (%) No data available/missing 0.0 0.0 NaN
no 16.7 10.3
yes 83.3 86.2
<NA> 0.0 3.4
Hypertension.composite (%) No data available/missing 0.0 0.0 NaN
no 0.0 6.9
yes 100.0 93.1
Hypertension.drugs (%) No data available/missing 0.0 0.0 NaN
no 16.7 3.4
yes 83.3 93.1
<NA> 0.0 3.4
Med.anticoagulants (%) No data available/missing 0.0 0.0 NaN
no 100.0 86.2
yes 0.0 6.9
<NA> 0.0 6.9
Med.all.antiplatelet (%) No data available/missing 0.0 0.0 NaN
no 0.0 31.0
yes 100.0 65.5
<NA> 0.0 3.4
Med.Statin.LLD (%) No data available/missing 0.0 0.0 NaN
no 0.0 24.1
yes 100.0 72.4
<NA> 0.0 3.4
Stroke_Dx (%) Missing 0.0 0.0 NaN
No stroke diagnosed 66.7 48.3
Stroke diagnosed 33.3 51.7
sympt (%) missing 0.0 0.0 NaN
Asymptomatic 100.0 0.0
TIA 0.0 17.2
minor stroke 0.0 41.4
Major stroke 0.0 10.3
Amaurosis fugax 0.0 17.2
Four vessel disease 0.0 0.0
Vertebrobasilary TIA 0.0 0.0
Retinal infarction 0.0 3.4
Symptomatic, but aspecific symtoms 0.0 0.0
Contralateral symptomatic occlusion 0.0 0.0
retinal infarction 0.0 3.4
armclaudication due to occlusion subclavian artery, CEA needed for bypass 0.0 0.0
retinal infarction + TIAs 0.0 0.0
Ocular ischemic syndrome 0.0 6.9
ischemisch glaucoom 0.0 0.0
subclavian steal syndrome 0.0 0.0
TGA 0.0 0.0
Symptoms.5G (%) Asymptomatic 100.0 0.0 NaN
Ocular 0.0 24.1
Other 0.0 0.0
Retinal infarction 0.0 6.9
Stroke 0.0 51.7
TIA 0.0 17.2
AsymptSympt (%) Asymptomatic 100.0 0.0 <0.001
Ocular and others 0.0 31.0
Symptomatic 0.0 69.0
AsymptSympt2G (%) Asymptomatic 100.0 0.0 <0.001
Symptomatic 0.0 100.0
restenos (%) missing 0.0 0.0 NaN
de novo 100.0 100.0
restenosis 0.0 0.0
stenose bij angioseal na PTCA 0.0 0.0
stenose (%) missing 0.0 0.0 NaN
0-49% 0.0 3.4
50-70% 16.7 17.2
70-90% 16.7 48.3
90-99% 33.3 13.8
100% (Occlusion) 0.0 0.0
NA 0.0 0.0
50-99% 0.0 0.0
70-99% 33.3 17.2
99 0.0 0.0
CAD_history (%) Missing 0.0 0.0 NaN
No history CAD 100.0 69.0
History CAD 0.0 31.0
PAOD (%) missing/no data 0.0 0.0 NaN
no 50.0 93.1
yes 50.0 6.9
Peripheral.interv (%) no 50.0 82.8 0.125
yes 50.0 13.8
<NA> 0.0 3.4
EP_composite (%) No data available. 0.0 0.0 NaN
No composite endpoints 66.7 37.9
Composite endpoints 0.0 13.8
<NA> 33.3 48.3
EP_composite_time (mean (SD)) 1.326 (0.557) 0.826 (0.421) 0.064
Writing the baseline table to Excel format.
# Write basetable
require(openxlsx)
write.xlsx(file = paste0(OUT_loc, "/",Today,".",PROJECTNAME,".AE.BaselineTable.scRNAseq.xlsx"),
format(scRNAseqDataMetaAE.all.tableOne, digits = 5, scientific = FALSE) , row.names = TRUE, col.names = TRUE)
Here review the number of cells per sample, plate, and patients. And plot the ratio’s per sample and study number.
## check stuff
cat("\nHow many cells per type ...?")
How many cells per type ...?
sort(table(scRNAseqData@meta.data$SCT_snn_res.0.8))
17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
31 34 84 110 151 172 190 203 211 225 290 345 437 534 577 626 861 1110
cat("\n\nHow many cells per plate ...?")
How many cells per plate ...?
sort(table(scRNAseqData@meta.data$ID))
4530.P1 4440.P1 4472.P1 4478.P1 4477.P1 4500.P1 4458.P1 4459.P1 4447.P2 4447.P3 4487.P2 4502.P1 4455.P1 4496.P1 4501.P1 4447.P1 4489.P1 4476.P1 4448.P1 4487.P1 4571.P1 4495.P1
4 11 32 40 41 45 47 48 51 66 75 80 82 88 93 96 102 104 105 112 112 115
4432.P1 4520.P1 4450.P2 4545.P1 4513.P1 4452.P3 4453.P3 4452.P2 4450.P1 4558.P1 4535.P1 4488.P1 4480.P1 4470.P1 4450.P3 4453.P1 4486.P1 4452.P1 4546.P1 4443.P2 4491.P1 4453.P2
129 130 134 135 139 141 143 155 157 157 158 159 161 165 166 177 179 183 188 189 193 197
4530.P2 4443.P1 4521.P2 4443.P3 4542.P1
207 209 212 239 240
cat("\n\nHow many cells per type per plate ...?")
How many cells per type per plate ...?
table(scRNAseqData@meta.data$SCT_snn_res.0.8, scRNAseqData@meta.data$ID)
4432.P1 4440.P1 4443.P1 4443.P2 4443.P3 4447.P1 4447.P2 4447.P3 4448.P1 4450.P1 4450.P2 4450.P3 4452.P1 4452.P2 4452.P3 4453.P1 4453.P2 4453.P3 4455.P1 4458.P1 4459.P1
0 28 2 31 56 53 8 7 12 22 15 22 22 5 15 21 19 20 23 10 5 2
1 23 0 0 1 0 0 0 0 27 13 10 11 7 17 14 18 27 17 9 2 9
2 18 1 15 11 8 9 5 4 11 20 23 25 4 8 8 7 4 6 19 8 9
3 2 0 3 4 4 4 3 8 0 27 20 50 134 48 32 42 78 12 2 2 0
4 14 2 5 7 5 2 3 2 6 12 5 5 7 10 6 33 15 19 10 1 2
5 11 3 21 22 28 5 6 5 7 9 4 5 5 6 4 17 14 14 5 6 6
6 1 1 66 46 73 15 1 4 3 4 7 6 4 10 13 18 10 26 3 1 3
7 4 0 14 6 4 8 4 6 3 31 24 10 5 9 14 4 2 5 0 2 1
8 9 0 10 2 5 4 3 7 4 2 5 4 0 2 2 1 3 1 4 2 0
9 3 1 3 2 2 8 5 8 8 14 4 8 2 3 2 0 0 2 1 8 0
10 7 0 5 6 1 2 2 0 2 2 3 6 3 4 2 1 4 2 7 2 2
11 0 1 7 5 2 23 11 8 0 0 5 3 0 8 7 1 2 5 3 2 6
12 4 0 8 7 11 0 0 0 9 0 0 5 3 6 6 2 11 3 3 0 4
13 1 0 20 7 40 6 0 0 1 0 1 1 0 0 1 0 1 0 2 1 1
14 1 0 0 1 1 1 0 1 1 5 1 3 1 2 6 4 3 1 2 3 0
15 0 0 1 5 1 1 1 1 0 1 0 1 3 6 2 4 3 3 1 2 1
16 1 0 0 0 1 0 0 0 0 0 0 0 0 0 1 2 0 3 0 0 2
17 2 0 0 1 0 0 0 0 1 2 0 1 0 1 0 4 0 1 1 0 0
4470.P1 4472.P1 4476.P1 4477.P1 4478.P1 4480.P1 4486.P1 4487.P1 4487.P2 4488.P1 4489.P1 4491.P1 4495.P1 4496.P1 4500.P1 4501.P1 4502.P1 4513.P1 4520.P1 4521.P2 4530.P1
0 44 1 10 11 7 15 38 15 5 28 30 38 12 7 9 28 25 29 23 43 0
1 25 5 21 12 7 51 21 27 21 44 20 51 29 9 4 28 17 24 20 48 1
2 16 10 8 2 6 16 31 18 11 10 4 11 14 49 5 10 1 9 13 13 1
3 2 0 7 0 1 3 7 0 14 1 0 1 2 3 6 3 2 2 7 14 1
4 23 0 7 2 1 1 7 1 2 4 11 16 15 3 3 2 17 29 8 32 0
5 5 3 13 1 5 17 5 11 4 8 4 22 14 4 8 3 5 17 7 14 1
6 20 0 0 0 0 0 1 9 0 0 0 0 0 0 0 0 0 0 0 0 0
7 7 3 3 0 1 12 19 5 0 18 4 2 3 1 1 3 0 1 1 0 0
8 4 3 3 1 4 2 12 7 2 6 9 5 6 1 0 1 2 14 19 1 0
9 0 0 1 0 1 3 14 6 1 17 4 9 1 2 0 1 1 4 7 2 0
10 1 4 4 1 0 12 3 2 1 2 4 18 7 3 0 7 1 4 3 4 0
11 7 1 10 5 1 9 8 0 11 2 3 2 5 0 2 5 2 1 7 5 0
12 7 1 2 5 4 8 4 4 1 16 2 9 1 0 0 0 7 4 0 6 0
13 2 0 1 0 0 0 1 3 0 1 0 3 0 1 0 0 0 0 1 20 0
14 0 0 9 0 1 2 0 1 1 1 3 1 4 1 5 0 0 1 11 2 0
15 2 1 2 0 1 8 6 2 1 0 1 4 1 2 0 1 0 0 1 6 0
16 0 0 2 0 0 2 2 0 0 1 0 0 1 1 2 1 0 0 0 1 0
17 0 0 1 1 0 0 0 1 0 0 3 1 0 1 0 0 0 0 2 1 0
4530.P2 4535.P1 4542.P1 4545.P1 4546.P1 4558.P1 4571.P1
0 29 32 78 57 44 44 10
1 12 24 41 17 38 28 11
2 72 12 11 1 8 11 30
3 9 1 2 6 3 3 2
4 17 45 13 14 64 22 4
5 18 12 12 3 4 3 11
6 0 0 0 0 0 0 0
7 3 1 23 4 4 7 8
8 25 2 5 8 1 3 9
9 2 0 37 5 2 5 2
10 4 12 5 6 12 19 1
11 0 2 1 1 1 0 0
12 0 4 1 1 2 1 0
13 5 0 3 5 1 6 15
14 6 7 2 6 3 2 4
15 0 2 2 0 1 2 1
16 1 2 3 1 0 0 4
17 4 0 1 0 0 1 0
cat("\n\nHow many cells per patient ...?")
How many cells per patient ...?
sort(table(scRNAseqData@meta.data$Patient))
4440 4472 4478 4477 4500 4458 4459 4502 4455 4496 4501 4489 4476 4448 4571 4495 4432 4520 4545 4513 4558 4535 4488 4480 4470 4486 4487 4546 4491 4530 4521 4447 4542 4450 4452
11 32 40 41 45 47 48 80 82 88 93 102 104 105 112 115 129 130 135 139 157 158 159 161 165 179 187 188 193 211 212 213 240 457 479
4453 4443
517 637
cat("\n\nVisualizing these ratio's per study number and sample ...?")
Visualizing these ratio's per study number and sample ...?
UMAPPlot(scRNAseqData, label = TRUE, pt.size = 1.25, label.size = 4, group.by = "ident",
repel = TRUE)
ggsave(paste0(PLOT_loc, "/", Today, ".UMAP.png"), plot = last_plot())
Saving 7.29 x 4.51 in image
ggsave(paste0(PLOT_loc, "/", Today, ".UMAP.ps"), plot = last_plot())
Saving 7.29 x 4.51 in image
barplot(prop.table(x = table(scRNAseqData@active.ident, scRNAseqData@meta.data$Patient)),
cex.axis = 1.0, cex.names = 0.5, las = 1,
col = uithof_color, xlab = "study number", legend.text = FALSE, args.legend = list(x = "bottom"))
dev.copy(pdf, paste0(QC_loc, "/", Today, ".cell_ratios_per_sample.pdf"))
pdf
3
dev.off()
quartz_off_screen
2
barplot(prop.table(x = table(scRNAseqData@active.ident, scRNAseqData@meta.data$ID)),
cex.axis = 1.0, cex.names = 0.5, las = 2,
col = uithof_color, xlab = "sample ID", legend.text = FALSE, args.legend = list(x = "bottom"))
dev.copy(pdf, paste0(QC_loc, "/", Today, ".cell_ratios_per_sample_per_plate.pdf"))
pdf
3
dev.off()
quartz_off_screen
2
Let’s project known cellular markers.
UMAPPlot(scRNAseqData, label = FALSE, pt.size = 1.25, label.size = 4, group.by = "ident",
repel = TRUE)
# endothelial cells
FeaturePlot(scRNAseqData, features = c("CD34"), cols = c("#ECECEC", "#DB003F"))
FeaturePlot(scRNAseqData, features = c("EDN1"), cols = c("#ECECEC", "#DB003F"))
FeaturePlot(scRNAseqData, features = c("EDNRA", "EDNRB"), cols = c("#ECECEC", "#DB003F"))
FeaturePlot(scRNAseqData, features = c("CDH5", "PECAM1"), cols = c("#ECECEC", "#DB003F"))
FeaturePlot(scRNAseqData, features = c("ACKR1"), cols = c("#ECECEC", "#DB003F"))
# SMC
FeaturePlot(scRNAseqData, features = c("MYH11"), cols = c("#ECECEC", "#DB003F"))
FeaturePlot(scRNAseqData, features = c("LGALS3", "ACTA2"), cols = c("#ECECEC", "#DB003F"))
# macrophages
FeaturePlot(scRNAseqData, features = c("CD14", "CD68"), cols = c("#ECECEC", "#DB003F"))
FeaturePlot(scRNAseqData, features = c("CD36"), cols = c("#ECECEC", "#DB003F"))
# t-cells
FeaturePlot(scRNAseqData, features = c("CD3E"), cols = c("#ECECEC", "#DB003F"))
FeaturePlot(scRNAseqData, features = c("CD4"), cols = c("#ECECEC", "#DB003F"))
# FeaturePlot(scRNAseqData, features = c("CD8"), cols = c("#ECECEC", "#DB003F"))
# b-cells
FeaturePlot(scRNAseqData, features = c("CD79A"), cols = c("#ECECEC", "#DB003F"))
# mast cells
FeaturePlot(scRNAseqData, features = c("KIT"), cols = c("#ECECEC", "#DB003F"))
# NK cells
FeaturePlot(scRNAseqData, features = c("NCAM1"), cols = c("#ECECEC", "#DB003F"))
We check whether the targets genes, MCP1, CCR2, IL6, IL6R, were sequenced using our method (STARseq). Given that COL4A1/2 are downstream targets of MMP2 we also checked the cell type specific expression of these genes.
UMAPPlot(scRNAseqData, label = FALSE, pt.size = 1.25, label.size = 4, group.by = "ident",
repel = TRUE)
VlnPlot(scRNAseqData, features = TARGET_A_alt) +
xlab("cell communities") +
ylab(bquote("normalized "~italic(CCL2)~" expression")) +
theme(
axis.title.x = element_text(color = "#000000", size = 14, face = "bold"),
axis.title.y = element_text(color = "#000000", size = 14, face = "bold"),
legend.position = "none")
ggsave(paste0(PLOT_loc, "/", Today, ".VlnPlot.",TARGET_A,".png"), plot = last_plot())
Saving 7.29 x 4.51 in image
ggsave(paste0(PLOT_loc, "/", Today, ".VlnPlot.",TARGET_A,".ps"), plot = last_plot())
Saving 7.29 x 4.51 in image
VlnPlot(scRNAseqData, features = TARGET_B) +
xlab("cell communities") +
ylab(bquote("normalized "~italic(CCR2)~" expression")) +
theme(
axis.title.x = element_text(color = "#000000", size = 14, face = "bold"),
axis.title.y = element_text(color = "#000000", size = 14, face = "bold"),
legend.position = "none")
ggsave(paste0(PLOT_loc, "/", Today, ".VlnPlot.",TARGET_B,".png"), plot = last_plot())
Saving 7.29 x 4.51 in image
ggsave(paste0(PLOT_loc, "/", Today, ".VlnPlot.",TARGET_B,".ps"), plot = last_plot())
Saving 7.29 x 4.51 in image
# VlnPlot(scRNAseqData, features = TARGET_C) +
# xlab("cell communities") +
# ylab(bquote("normalized "~italic(IL6)~" expression")) +
# theme(
# axis.title.x = element_text(color = "#000000", size = 14, face = "bold"),
# axis.title.y = element_text(color = "#000000", size = 14, face = "bold"),
# legend.position = "none")
# ggsave(paste0(PLOT_loc, "/", Today, ".VlnPlot.",TARGET_C,".png"), plot = last_plot())
# ggsave(paste0(PLOT_loc, "/", Today, ".VlnPlot.",TARGET_C,".ps"), plot = last_plot())
#
# VlnPlot(scRNAseqData, features = TARGET_D) +
# xlab("cell communities") +
# ylab(bquote("normalized "~italic(IL6R)~" expression")) +
# theme(
# axis.title.x = element_text(color = "#000000", size = 14, face = "bold"),
# axis.title.y = element_text(color = "#000000", size = 14, face = "bold"),
# legend.position = "none")
# ggsave(paste0(PLOT_loc, "/", Today, ".VlnPlot.",TARGET_D,".png"), plot = last_plot())
# ggsave(paste0(PLOT_loc, "/", Today, ".VlnPlot.",TARGET_D,".ps"), plot = last_plot())
# GenesOfInterest_TargetsA <- c("CCL2", "CCR2", "IL6", "IL6R", "ACTA2", "MYH11", "CD34", "CD14", "CD68", "CD3E", "CD4")
# GenesOfInterest_TargetsA <- c("CCL2", "CCR2", "IL6", "IL6R")
GenesOfInterest_TargetsA <- c("CCL2", "CCR2")
library(RColorBrewer)
p1 <- DotPlot(scRNAseqData, features = GenesOfInterest_TargetsA,
cols = "RdBu")
p1 + theme(axis.text.x = element_text(angle = 45, hjust=1))
ggsave(paste0(PLOT_loc, "/", Today, ".DotPlot.Targets.png"), plot = last_plot())
Saving 7.29 x 4.51 in image
ggsave(paste0(PLOT_loc, "/", Today, ".DotPlot.Targets.ps"), plot = last_plot())
Saving 7.29 x 4.51 in image
rm(p1)
# Visualize co-expression of two features simultaneously
FeaturePlot(scRNAseqData, features = c(TARGET_A_alt, "CD14"),
# cols = c("#ECECEC", "#DB003F", "#9A3480","#1290D9"),
# combine = TRUE,
blend = TRUE)
FeaturePlot(scRNAseqData, features = c(TARGET_A_alt, "CD68"),
# cols = c("#ECECEC", "#DB003F", "#9A3480","#1290D9"),
# combine = TRUE,
blend = TRUE)
FeaturePlot(scRNAseqData, features = c(TARGET_B, "CD14"),
# cols = c("#ECECEC", "#DB003F", "#9A3480","#1290D9"),
# combine = TRUE,
blend = TRUE)
FeaturePlot(scRNAseqData, features = c(TARGET_B, "CD68"),
# cols = c("#ECECEC", "#DB003F", "#9A3480","#1290D9"),
# combine = TRUE,
blend = TRUE)
FeaturePlot(scRNAseqData, features = c(TARGET_B, "CD79A"),
# cols = c("#ECECEC", "#DB003F", "#9A3480","#1290D9"),
# combine = TRUE,
blend = TRUE)
FeaturePlot(scRNAseqData, features = c(TARGET_B, "CD79A"),
# cols = c("#ECECEC", "#DB003F", "#9A3480","#1290D9"),
# combine = TRUE,
blend = TRUE)
# FeaturePlot(scRNAseqData, features = c(TARGET_D, "CD79A"),
# # cols = c("#ECECEC", "#DB003F", "#9A3480","#1290D9"),
# # combine = TRUE,
# blend = TRUE)
# FeaturePlot(scRNAseqData, features = c(TARGET_D, "CD79A"),
# # cols = c("#ECECEC", "#DB003F", "#9A3480","#1290D9"),
# # combine = TRUE,
# blend = TRUE)
FeaturePlot(scRNAseqData, features = c(GenesOfInterest_TargetsA),
cols = c("#ECECEC", "#DB003F", "#9A3480","#1290D9"),
combine = TRUE)
ggsave(paste0(PLOT_loc, "/", Today, ".FeaturePlot.Targets.png"), plot = last_plot())
Saving 7.29 x 4.51 in image
ggsave(paste0(PLOT_loc, "/", Today, ".FeaturePlot.Targets.ps"), plot = last_plot())
Saving 7.29 x 4.51 in image
FeaturePlot(scRNAseqData, features = c(TARGET_A_alt), cols = c("#ECECEC", "#DB003F"))
FeaturePlot(scRNAseqData, features = c(TARGET_B), cols = c("#ECECEC", "#DB003F"))
# FeaturePlot(scRNAseqData, features = c(TARGET_C), cols = c("#ECECEC", "#DB003F"))
# FeaturePlot(scRNAseqData, features = c(TARGET_D), cols = c("#ECECEC", "#DB003F"))
Comparison between the macrophages cell communities (CD14/CD68+), and all other communities.
N_GENES=20552
MAC.markers <- FindMarkers(object = scRNAseqData,
ident.1 = c("CD14+CD68+ M I",
"CD14+CD68+ M II",
"CD14+CD68+ M III"),
ident.2 = c(#"CD14+CD68+ M I",
#"CD14+CD68+ M II",
#"CD14+CD68+ M III",
"CD3+CD8+ T I",
"CD3+CD8A+ T II ",
"CD3+CD8 T III",
"CD3+CD4+ T I",
"CD3+CD4+ T II",
"CD3+CD4+ T III",
"CD34+ EC I", "CD34+ EC II",
"Mixed I",
"Mixed II",
"ACTA2+ SMC",
"NCAM1+ NK",
"KIT+ MC",
"CD79A+ B I",
"CD79A+ B II"))
| | 0 % ~calculating
|+ | 1 % ~29s
|++ | 2 % ~30s
|++ | 3 % ~29s
|+++ | 4 % ~01m 13s
|+++ | 5 % ~01m 04s
|++++ | 6 % ~57s
|++++ | 7 % ~53s
|+++++ | 8 % ~49s
|+++++ | 9 % ~47s
|++++++ | 10% ~45s
|++++++ | 11% ~42s
|+++++++ | 12% ~41s
|+++++++ | 13% ~39s
|++++++++ | 14% ~38s
|++++++++ | 15% ~37s
|+++++++++ | 16% ~36s
|+++++++++ | 17% ~35s
|++++++++++ | 18% ~35s
|++++++++++ | 19% ~34s
|+++++++++++ | 20% ~34s
|+++++++++++ | 21% ~33s
|++++++++++++ | 22% ~33s
|++++++++++++ | 23% ~32s
|+++++++++++++ | 24% ~32s
|+++++++++++++ | 26% ~31s
|++++++++++++++ | 27% ~31s
|++++++++++++++ | 28% ~31s
|+++++++++++++++ | 29% ~30s
|+++++++++++++++ | 30% ~30s
|++++++++++++++++ | 31% ~29s
|++++++++++++++++ | 32% ~28s
|+++++++++++++++++ | 33% ~27s
|+++++++++++++++++ | 34% ~27s
|++++++++++++++++++ | 35% ~26s
|++++++++++++++++++ | 36% ~26s
|+++++++++++++++++++ | 37% ~25s
|+++++++++++++++++++ | 38% ~24s
|++++++++++++++++++++ | 39% ~24s
|++++++++++++++++++++ | 40% ~23s
|+++++++++++++++++++++ | 41% ~23s
|+++++++++++++++++++++ | 42% ~22s
|++++++++++++++++++++++ | 43% ~22s
|++++++++++++++++++++++ | 44% ~21s
|+++++++++++++++++++++++ | 45% ~21s
|+++++++++++++++++++++++ | 46% ~20s
|++++++++++++++++++++++++ | 47% ~20s
|++++++++++++++++++++++++ | 48% ~20s
|+++++++++++++++++++++++++ | 49% ~19s
|+++++++++++++++++++++++++ | 50% ~19s
|++++++++++++++++++++++++++ | 51% ~18s
|+++++++++++++++++++++++++++ | 52% ~18s
|+++++++++++++++++++++++++++ | 53% ~17s
|++++++++++++++++++++++++++++ | 54% ~17s
|++++++++++++++++++++++++++++ | 55% ~16s
|+++++++++++++++++++++++++++++ | 56% ~16s
|+++++++++++++++++++++++++++++ | 57% ~16s
|++++++++++++++++++++++++++++++ | 58% ~15s
|++++++++++++++++++++++++++++++ | 59% ~15s
|+++++++++++++++++++++++++++++++ | 60% ~14s
|+++++++++++++++++++++++++++++++ | 61% ~14s
|++++++++++++++++++++++++++++++++ | 62% ~14s
|++++++++++++++++++++++++++++++++ | 63% ~13s
|+++++++++++++++++++++++++++++++++ | 64% ~13s
|+++++++++++++++++++++++++++++++++ | 65% ~12s
|++++++++++++++++++++++++++++++++++ | 66% ~12s
|++++++++++++++++++++++++++++++++++ | 67% ~12s
|+++++++++++++++++++++++++++++++++++ | 68% ~11s
|+++++++++++++++++++++++++++++++++++ | 69% ~11s
|++++++++++++++++++++++++++++++++++++ | 70% ~11s
|++++++++++++++++++++++++++++++++++++ | 71% ~10s
|+++++++++++++++++++++++++++++++++++++ | 72% ~10s
|+++++++++++++++++++++++++++++++++++++ | 73% ~10s
|++++++++++++++++++++++++++++++++++++++ | 74% ~09s
|++++++++++++++++++++++++++++++++++++++ | 76% ~09s
|+++++++++++++++++++++++++++++++++++++++ | 77% ~08s
|+++++++++++++++++++++++++++++++++++++++ | 78% ~08s
|++++++++++++++++++++++++++++++++++++++++ | 79% ~08s
|++++++++++++++++++++++++++++++++++++++++ | 80% ~07s
|+++++++++++++++++++++++++++++++++++++++++ | 81% ~07s
|+++++++++++++++++++++++++++++++++++++++++ | 82% ~07s
|++++++++++++++++++++++++++++++++++++++++++ | 83% ~06s
|++++++++++++++++++++++++++++++++++++++++++ | 84% ~06s
|+++++++++++++++++++++++++++++++++++++++++++ | 85% ~05s
|+++++++++++++++++++++++++++++++++++++++++++ | 86% ~05s
|++++++++++++++++++++++++++++++++++++++++++++ | 87% ~05s
|++++++++++++++++++++++++++++++++++++++++++++ | 88% ~04s
|+++++++++++++++++++++++++++++++++++++++++++++ | 89% ~04s
|+++++++++++++++++++++++++++++++++++++++++++++ | 90% ~04s
|++++++++++++++++++++++++++++++++++++++++++++++ | 91% ~03s
|++++++++++++++++++++++++++++++++++++++++++++++ | 92% ~03s
|+++++++++++++++++++++++++++++++++++++++++++++++ | 93% ~03s
|+++++++++++++++++++++++++++++++++++++++++++++++ | 94% ~02s
|++++++++++++++++++++++++++++++++++++++++++++++++ | 95% ~02s
|++++++++++++++++++++++++++++++++++++++++++++++++ | 96% ~01s
|+++++++++++++++++++++++++++++++++++++++++++++++++ | 97% ~01s
|+++++++++++++++++++++++++++++++++++++++++++++++++ | 98% ~01s
|++++++++++++++++++++++++++++++++++++++++++++++++++| 99% ~00s
|++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=35s
DT::datatable(MAC.markers)
MAC_Volcano_TargetsA = EnhancedVolcano(MAC.markers,
lab = rownames(MAC.markers),
x = "avg_log2FC",
y = "p_val_adj",
selectLab = GenesOfInterest_TargetsA,
axisLabSize = 12,
xlab = "average fold-change",
title = "Macrophage markers\n(Macrophage communities vs the rest)",
titleLabSize = 14,
pCutoff = 0.05/N_GENES, # 20552 genes
FCcutoff = 1.25,
pointSize = 1.5,
labSize = 3.0,
legendLabels = c('NS','avg. fold-change','P',
'P & avg. fold-change'),
legendPosition = "right",
legendLabSize = 10,
legendIconSize = 3.0,
drawConnectors = TRUE,
widthConnectors = 0.2,
colConnectors = "#595A5C",
gridlines.major = FALSE,
gridlines.minor = FALSE)
One or more p-values is 0. Converting to 10^-1 * current lowest non-zero p-value...
MAC_Volcano_TargetsA
ggsave(paste0(PLOT_loc, "/", Today, ".Volcano.MAC.DEG.Targets.pdf"),
plot = MAC_Volcano_TargetsA)
Saving 7.29 x 4.51 in image
On average FTL, LYZ, HLA-DRA, IFI30, and TCSB among others have a higher expression in the macrophage cell communities, than in all the others.
Below the results for the TARGET_GENES.
# temp = subset(MAC.markers, (rownames(MAC.markers)=="CCL2" | rownames(MAC.markers)=="CCR2" | rownames(MAC.markers)=="IL6" | rownames(MAC.markers)=="IL6R"))
temp = subset(MAC.markers, (rownames(MAC.markers)=="CCL2" | rownames(MAC.markers)=="CCR2"))
DT::datatable(temp)
rm(temp)
We will save these results too,
MAC.markers <- data.frame(Gene = rownames(MAC.markers), MAC.markers)
head(MAC.markers)
fwrite(MAC.markers, file = paste0(OUT_loc,"/",Today,".MAC.markers.txt.gz"),
quote = FALSE, sep = "\t", na ="", dec = ".",
showProgress = TRUE, verbose = FALSE,
compress = c("gzip"))
Version: v1.0.1
Last update: 2021-02-09
Written by: Sander W. van der Laan (s.w.vanderlaan-2[at]umcutrecht.nl).
Description: Script to load single-cell RNA sequencing (scRNAseq) data, and perform quality control (QC), and initial mapping to cells.
Minimum requirements: R version 3.5.2 (2018-12-20) -- 'Eggshell Igloo', macOS Mojave (10.14.2).
Change log
* v1.0.1 Update to reviewer comments and changes in EnhancedVolcano.
* v1.0.0 Initial version
sessionInfo()
R version 4.0.3 (2020-10-10)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Big Sur 10.16
Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] parallel stats4 tools stats graphics grDevices utils datasets methods base
other attached packages:
[1] RColorBrewer_1.1-2 SeuratObject_4.0.0 Seurat_4.0.0 EnhancedVolcano_1.8.0 ggrepel_0.9.1 mygene_1.26.0
[7] GenomicFeatures_1.42.1 GenomicRanges_1.42.0 GenomeInfoDb_1.26.2 org.Hs.eg.db_3.12.0 AnnotationDbi_1.52.0 IRanges_2.24.1
[13] S4Vectors_0.28.1 Biobase_2.50.0 BiocGenerics_0.36.0 tidylog_1.0.2 ggsci_2.9 GGally_2.1.0
[19] PerformanceAnalytics_2.0.4 xts_0.12.1 zoo_1.8-8 ggcorrplot_0.1.3.999 Hmisc_4.4-2 Formula_1.2-4
[25] lattice_0.20-41 survminer_0.4.8 survival_3.2-7 patchwork_1.1.1 openxlsx_4.2.3 ggpubr_0.4.0
[31] tableone_0.12.0 labelled_2.7.0 sjPlot_2.8.7 sjlabelled_1.1.7 haven_2.3.1 devtools_2.3.2
[37] usethis_2.0.0 MASS_7.3-53 DT_0.17 knitr_1.31 forcats_0.5.1 stringr_1.4.0
[43] purrr_0.3.4 tibble_3.0.6 ggplot2_3.3.3 tidyverse_1.3.0 data.table_1.13.6 naniar_0.6.0
[49] tidyr_1.1.2 dplyr_1.0.4 optparse_1.6.6 readr_1.4.0
loaded via a namespace (and not attached):
[1] ica_1.0-2 Rsamtools_2.6.0 class_7.3-18 ps_1.5.0 lmtest_0.9-38 rprojroot_2.0.2
[7] crayon_1.4.1 nlme_3.1-152 backports_1.2.1 reprex_1.0.0 rlang_0.4.10 XVector_0.30.0
[13] ROCR_1.0-11 readxl_1.3.1 performance_0.7.0 irlba_2.3.3 nloptr_1.2.2.2 extrafontdb_1.0
[19] callr_3.5.1 proto_1.0.0 BiocParallel_1.24.1 extrafont_0.17 bit64_4.0.5 glue_1.4.2
[25] sctransform_0.3.2 processx_3.4.5 vipor_0.4.5 SummarizedExperiment_1.20.0 tidyselect_1.1.0 km.ci_0.5-2
[31] rio_0.5.16 fitdistrplus_1.1-3 XML_3.99-0.5 proj4_1.0-10.1 GenomicAlignments_1.26.0 sjmisc_2.8.6
[37] chron_2.3-56 xtable_1.8-4 magrittr_2.0.1 evaluate_0.14 cli_2.3.0 zlibbioc_1.36.0
[43] rstudioapi_0.13 miniUI_0.1.1.1 rpart_4.1-15 tinytex_0.29 maps_3.3.0 shiny_1.6.0
[49] xfun_0.20 askpass_1.1 parameters_0.11.0 pkgbuild_1.2.0 cluster_2.1.0 listenv_0.8.0
[55] Biostrings_2.58.0 png_0.1-7 reshape_0.8.8 future_1.21.0 withr_2.4.1 bitops_1.0-6
[61] plyr_1.8.6 cellranger_1.1.0 e1071_1.7-4 survey_4.0 coda_0.19-4 pillar_1.4.7
[67] cachem_1.0.3 multcomp_1.4-16 fs_1.5.0 vctrs_0.3.6 ellipsis_0.3.1 generics_0.1.0
[73] gsubfn_0.7 foreign_0.8-81 beeswarm_0.2.3 munsell_0.5.0 DelayedArray_0.16.1 emmeans_1.5.4
[79] rtracklayer_1.50.0 fastmap_1.1.0 compiler_4.0.3 pkgload_1.1.0 abind_1.4-5 httpuv_1.5.5
[85] sessioninfo_1.1.1 plotly_4.9.3 GenomeInfoDbData_1.2.4 gridExtra_2.3 deldir_0.2-9 later_1.1.0.1
[91] BiocFileCache_1.14.0 jsonlite_1.7.2 scales_1.1.1 pbapply_1.4-3 carData_3.0-4 estimability_1.3
[97] lazyeval_0.2.2 promises_1.1.1 spatstat_1.64-1 car_3.0-10 latticeExtra_0.6-29 goftest_1.2-2
[103] spatstat.utils_2.0-0 reticulate_1.18 effectsize_0.4.3 checkmate_2.0.0 rmarkdown_2.6 ash_1.0-15
[109] sandwich_3.0-0 cowplot_1.1.1 statmod_1.4.35 Rtsne_0.15 uwot_0.1.10 igraph_1.2.6
[115] yaml_2.2.1 htmltools_0.5.1.1 memoise_2.0.0 quadprog_1.5-8 viridisLite_0.3.0 digest_0.6.27
[121] assertthat_0.2.1 mime_0.9 rappdirs_0.3.3 Rttf2pt1_1.3.8 bayestestR_0.8.2 KMsurv_0.1-5
[127] RSQLite_2.2.3 sqldf_0.4-11 future.apply_1.7.0 remotes_2.2.0 blob_1.2.1 survMisc_0.5.5
[133] splines_4.0.3 labeling_0.4.2 RCurl_1.98-1.2 broom_0.7.4 hms_1.0.0 modelr_0.1.8
[139] colorspace_2.0-0 base64enc_0.1-3 ggbeeswarm_0.6.0 ggrastr_0.2.1 nnet_7.3-15 Rcpp_1.0.6
[145] RANN_2.6.1 mvtnorm_1.1-1 clisymbols_1.2.0 parallelly_1.23.0 R6_2.5.0 grid_4.0.3
[151] ggridges_0.5.3 lifecycle_0.2.0 zip_2.1.1 curl_4.3 ggsignif_0.6.0 minqa_1.2.4
[157] leiden_0.3.7 testthat_3.0.1 getopt_1.20.3 Matrix_1.3-2 desc_1.2.0 RcppAnnoy_0.0.18
[163] TH.data_1.0-10 htmlwidgets_1.5.3 polyclip_1.10-0 biomaRt_2.46.2 crosstalk_1.1.1 rvest_0.3.6
[169] mgcv_1.8-33 globals_0.14.0 openssl_1.4.3 insight_0.12.0 htmlTable_2.1.0 codetools_0.2-18
[175] matrixStats_0.58.0 lubridate_1.7.9.2 prettyunits_1.1.1 dbplyr_2.1.0 gtable_0.3.0 DBI_1.1.1
[181] visdat_0.5.3 tensor_1.5 httr_1.4.2 KernSmooth_2.23-18 stringi_1.5.3 progress_1.2.2
[187] reshape2_1.4.4 farver_2.0.3 xml2_1.3.2 boot_1.3-26 ggeffects_1.0.1 ggalt_0.4.0
[193] lme4_1.1-26 scattermore_0.7 bit_4.0.4 MatrixGenerics_1.2.1 sjstats_0.18.1 jpeg_0.1-8.1
[199] spatstat.data_1.7-0 pkgconfig_2.0.3 rstatix_0.6.0 mitools_2.4
save.image(paste0(PROJECT_loc, "/",Today,".",PROJECTNAME,".scrnaseq_results.RData"))
| © 1979-2021 Sander W. van der Laan | s.w.vanderlaan-2[at]umcutrecht.nl | swvanderlaan.github.io. |
Georgakis MK, Gill D, Rannikmae K, Traylor M, Anderson CD, Lee JM, Kamatani Y, Hopewell JC, Worrall BB, Bernhagen J, Sudlow CLM, Malik R, Dichgans M. Genetically determined levels of circulating cytokines and risk of stroke. Circulation. 2019;139:256-268↩︎
Georgakis MK, Gill D, Rannikmae K, Traylor M, Anderson CD, Lee JM, Kamatani Y, Hopewell JC, Worrall BB, Bernhagen J, Sudlow CLM, Malik R, Dichgans M. Genetically determined levels of circulating cytokines and risk of stroke. Circulation. 2019;139:256-268↩︎
Georgakis MK, Malik R, Bjorkbacka H, Pana TA, Demissie S, Ayers C, Elhadad MA, Fornage M, Beiser AS, Benjamin EJ, Boekholdt MS, Engstrom G, Herder C, Hoogeveen RC, Koenig W, Melander O, Orho-Melander M, Schiopu A, Soderholm M, Wareham N, Ballantyne CM, Peters A, Seshadri S, Myint PK, Nilsson J, de Lemos JA, Dichgans M. Circulating monocyte chemoattractant protein-1 and risk of stroke: Meta-analysis of population-based studies involving 17 180 individuals. Circ Res. 2019;125:773-782↩︎
Nelken NA, Coughlin SR, Gordon D, Wilcox JN. Monocyte chemoattractant protein-1 in human atheromatous plaques. J Clin Invest. 1991;88:1121-1127↩︎
Papadopoulou C, Corrigall V, Taylor PR, Poston RN. The role of the chemokines MCP-1, GRO-alpha, IL-8 and their receptors in the adhesion of monocytic cells to human atherosclerotic plaques. Cytokine. 2008;43:181-186↩︎
Takeya M, Yoshimura T, Leonard EJ, Takahashi K. Detection of monocyte chemoattractant protein-1 in human atherosclerotic lesions by an anti-monocyte chemoattractant protein-1 monoclonal antibody. Hum Pathol. 1993;24:534-539↩︎
Wilcox JN, Nelken NA, Coughlin SR, Gordon D, Schall TJ. Local expression of inflammatory cytokines in human atherosclerotic plaques. J Atheroscler Thromb. 1994;1 Suppl 1:S10-13↩︎